Okay. It's just about 1 o'clock, so I guess I'll get started. As you can probably tell
from the title of this talk, it's about cloud storage. I'm going to be covering an API-level
design vulnerability in a few of the different cloud systems. So I want to do a quick introduction
to that. My name is Zach. I'm a student at the University of Waterloo. Like many of you
guys here, I've had an interest in computer security and applied security for a very long
time. And this is my second DEF CON and the first time I'm speaking at a DEF CON or
any conference bigger than about 20 people.
Thanks. Hopefully I'll get that same response
afterwards, too. That remains to be seen. So I'm giving a talk on cloud storage.
Before this talk I was doing a little bit of recon. I was speaking to some
of my friends, trying to find out what it is that they use cloud storage for. And
a lot of them use it as a sort of USB key replacement. They use it to share large files,
10 megabytes or larger, with friends. Or they use it for backups of their documents. Or
they use it for availability and accessibility across several devices. Really,
for the most part, it replaces USB keys. And a lot of them still treat cloud storage systems
the same way they treat USB keys. They treat it as a large container
that they just throw files into until they run out of space and then delete a few to
free up a little bit of space afterwards. But one of the cool things about cloud storage
systems is they've got many more features than just space providing. So I have a little
chart here. I don't know if you can see it. But it speaks about some of the additional
mechanisms that these cloud storage providers have, like history or backup retention or
things like that. And that's really what we're targeting with this.
So the vulnerability, the main discussion I want to have with this is the idea that
treating files as blocks filling up a larger box doesn't quite represent cloud storage
when you have this time dimension. So if we try to reframe that previous picture
as a space-time graph, or as a Gantt chart, really, then when we're adding files,
we add them at different points in time. And then by removing files, we
can see that the lifespan of each file ends after a certain amount of time. And
then with this kind of representation, we can think about the amount of space we're
using as sort of a sliding bar. So at any given time, we are occupying a different amount
of space. So this gives us an interesting sort of mechanism with which we can recover
previously deleted files. So really what we're talking about is that a lot of these
cloud systems have a size limit for their quota management but a time limit
for their history and backup retention. So when you have these two independent
quota dimensions, you really have unlimited storage, because you can exploit
history retention to get additional space.
So really we're limited by our upload bandwidth rather than the upper limits we
have with the existing cloud quota.
So what this tool does when we upload a large file is cut
it up into several smaller fragments and upload these fragments
as successive versions of some arbitrary new file. And then we top it all off with
a chunk of zero size. This way the quota accounting mechanism
sees this as a zero-size file, despite the history backup holding all of the fragments.
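The upload step described above can be sketched in a few lines. This is only an illustration, not the speaker's code: `VersionedStore` is a hypothetical in-memory stand-in for a provider that keeps file history, and its rule that only the latest version counts against the quota is an assumption taken from the description in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedStore:
    """Hypothetical in-memory stand-in for a provider that keeps file history."""
    history: dict = field(default_factory=dict)  # name -> list of versions

    def put(self, name: str, data: bytes) -> None:
        # Every write to the same name is kept as a new version.
        self.history.setdefault(name, []).append(data)

    def quota_used(self) -> int:
        # Assumed accounting rule: only the latest version of each file counts.
        return sum(len(versions[-1]) for versions in self.history.values())


def upload_chopped(store: VersionedStore, name: str, payload: bytes, chunk_size: int) -> None:
    """Upload payload as successive versions of one file, then a zero-size cap."""
    for start in range(0, len(payload), chunk_size):
        store.put(name, payload[start:start + chunk_size])
    store.put(name, b"")  # the zero-size cap is what the quota sees


store = VersionedStore()
upload_chopped(store, "handle", b"x" * 1000, chunk_size=300)
print(store.quota_used())             # bytes counted against the quota
print(len(store.history["handle"]))   # versions held in history
```

With these assumptions the store reports zero bytes used while its history still holds every fragment.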
So retrieval is very easy if we use this process. All we have to do is pull all the
versions and glue them back together with cat.
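In Python, the equivalent of gluing the versions together with cat is a single join. The sketch below assumes the provider returns the versions oldest first, and it uses SHA-256 for the integrity check purely as an example; the talk does not say which checksum the tool uses.

```python
import hashlib


def reassemble(versions):
    """Concatenate versions oldest-first; the empty cap contributes nothing."""
    return b"".join(versions)


def sha256_hex(data: bytes) -> str:
    """Hex digest used here to compare the original and the restored data."""
    return hashlib.sha256(data).hexdigest()


original = bytes(range(256)) * 8
# Hypothetical version list as a provider might return it, oldest first,
# ending with the zero-size cap.
versions = [original[:700], original[700:1500], original[1500:], b""]

restored = reassemble(versions)
print(sha256_hex(restored) == sha256_hex(original))
```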
So going back to this storage-time graph I was working with earlier, I used this to represent
a file earlier, but really we can treat it more like this, where we have
different versions of this file that together create the original file but each occupy
a considerably smaller amount of space while they exist. So our account usage is actually
closer to zero when we look at it at any one point in time.
So it's a fairly easy idea, so I rolled it into a tool for you guys. I call this tool
DPAC chopper, you know, running with this whole cloud environment thing. What it does
is it chops up files and then packs them and then DPACs them afterwards. So it's a vertical
storage management framework. I've created a pluggable storage framework
that allows you to abstract out the API implementation specifics of the individual cloud storage utilities.
On top of this, the tool also maintains a storage database back end for tracking
the fragmentation history, the table of fragments, and
the original files that these fragments form, and it also provides a command-line
interface to the core functionality of these individual components. So I can talk
all day up here, but you guys really want to see a demo, right?
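One way to picture a pluggable storage framework like the one described is an abstract backend interface with one implementation per provider. Everything named below (`StorageBackend`, `InMemoryBackend`, `BACKENDS`, `get_backend`) is hypothetical and chosen for illustration; it is not taken from the tool itself.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical interface that hides provider-specific API details."""

    @abstractmethod
    def put_version(self, handle: str, data: bytes) -> None:
        """Write data as a new version of the file named by handle."""

    @abstractmethod
    def list_versions(self, handle: str) -> list:
        """Return all stored versions of handle, oldest first."""


class InMemoryBackend(StorageBackend):
    """Toy backend for testing; a real plugin would call a provider's API."""

    def __init__(self):
        self._versions = {}

    def put_version(self, handle, data):
        self._versions.setdefault(handle, []).append(data)

    def list_versions(self, handle):
        return list(self._versions.get(handle, []))


# Registry mapping a backend name to its class, so callers pick one by name.
BACKENDS = {"memory": InMemoryBackend}


def get_backend(name: str) -> StorageBackend:
    return BACKENDS[name]()


backend = get_backend("memory")
backend.put_version("handle", b"fragment-1")
backend.put_version("handle", b"")
print(backend.list_versions("handle"))
```

Keeping the chopping logic against the interface rather than a concrete provider is what would let a new provider be added as one more class in the registry.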
Right? All right.
All right.
Let's see. So, yeah. A little bit of a resolution problem there. Is that better? Okay. So
I'm starting here with nothing in this directory. Just showing you that there's
nothing up my sleeves. And I'm creating a 64 megabyte file that I'm going
to upload to this service. Here's the checksum of it. Just saving that for later. And then,
let's upload it. One of the things I'm trying to demonstrate
here is that there are ways of circumventing
existing detection mechanisms for this kind of thing. So what I'm doing here, and you
can see this here, is that the file size for the individual fragments is around
512K, plus or minus 5%. It's a normal distribution, to try and get around any mechanisms
in place to detect continual overwrites of the same file.
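A generator along these lines could produce the fragment sizes seen in the demo. It is a sketch under assumptions: the function name and parameters are made up, and "plus or minus 5%" is interpreted here as a standard deviation of 5% of the mean, which the talk does not confirm.

```python
import random


def fragment_sizes(total, mean=512 * 1024, rel_sigma=0.05, rng=None):
    """Yield fragment sizes that sum to total, drawn from a normal distribution."""
    rng = rng or random.Random()
    remaining = total
    while remaining > 0:
        # Draw a size around the mean; never emit an empty fragment.
        size = max(1, int(rng.gauss(mean, mean * rel_sigma)))
        # The final fragment is whatever is left over.
        size = min(size, remaining)
        yield size
        remaining -= size


# A fixed seed makes the run repeatable; drop it for real variation.
sizes = list(fragment_sizes(64 * 1024 * 1024, rng=random.Random(0)))
print(len(sizes), sum(sizes))
```

The last fragment comes out smaller than the rest, which matches the short second-to-last upload shown in the demo.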
Now, I'll get into this a little bit later. There's a bunch of different techniques you
can use to mask this. We're done with that. I'm going to show you a little bit more about
doing this. But for now, this demonstrates it fairly well. This information
is generated by the DPAC tool itself. It's showing you the individual chunks that belong
to this file as well as the file size per upload. I'm going to use this to compare
later on with the information I'm getting back from the server. This is all
locally generated information. So we're just about finished. Yeah, you can see the second-to-last
file there is about 200K, just to top it all off. And then the last one is zero
size. And you can see, I've gone back into this folder, and the
checksum I use here to act as the handle in the existing framework takes up zero size.
So back to where we were. Now I've deleted that binary and I'm busy reconstructing the
file from the fragments I'm getting back from the server. So these chunk numbers you see
here are the information provided by the REST
API that gives us the mapping to those individual chunks we were looking at earlier. If you
compare this list with the list we had earlier, you'll see a one-to-one mapping between the
file sizes we're getting back here and the file sizes we sent. Yeah. This is specific to Dropbox
in this example. But there's no reason it can't be extended to other cloud storage providers.
So it's finished downloading. It exists there. And you can see that the checksums match.
So we can actually use this for storage.
The tool in the form that I used there is available on the CDs you guys are getting
as part of the packages here. But I will also have the updated version of the code on GitHub
at this link. You can bug me for it afterwards. And what I like about this toolkit, and
one of the reasons I wrote it in Python, is that it gives us the extensibility for hiding
from these detection mechanisms. So, for example, we can maintain our own deltas that map to
real changes.
So we can make real changes in the file information rather than faking it
through the API here. We can also do a sort of adaptive mangling, using different file names.
Right now this tool just uploads with the git hash and uses that as the anchor point
in the cloud storage system. But there's no reason we have to use that.
So the future work I want to cover is extending the CLI. Right now it just supports get and put.
But, you know, it's fairly simple functionality to continue working on
there. I also want to get some more modules done. I looked at some other cloud storage
projects, just two or three, that have some mechanisms in place to defeat this but aren't
particularly rigorous themselves. So really only Dropbox works at this stage, but we can
work on that. Right, guys?
I also want to add some more tunable options so that we can look at different ways of
automating the process of generating the file fragments. In this case I used a generator
to generate 512K chunks with a normal distribution, but there's no reason we can't vary that in
a whole bunch of different ways. I overwrite just one file, but there's no reason
we can't move to multiple files.
There's a whole bunch of different ways we can take this depending on whichever tunable
options we want to use. So this wouldn't be a security talk without discussing
the implications of this kind of vulnerability. So if we look at the blue team concerns for
this, it's fairly straightforward to detect this by looking at the constant file size
of the writes, the timing, and the differences between the delta uploads.
But we can deal with this with generators by introducing subtle variations in the delay
between the uploads of the different versions of these files. We can also vary the name, and we
can also vary the file size. And that's how we can counteract their initial response to
this thing.
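The delay variation mentioned above could be driven by a generator such as the following. This is an illustrative sketch: the function names, the uniform jitter, and the 25% spread are all assumptions, not details from the talk.

```python
import random
import time


def jittered_delays(base_seconds, jitter=0.25, rng=None):
    """Yield delays spread uniformly within +/- jitter of base_seconds, forever."""
    rng = rng or random.Random()
    while True:
        yield base_seconds * rng.uniform(1 - jitter, 1 + jitter)


def upload_with_jitter(chunks, upload, delays, sleep=time.sleep):
    """Upload each chunk, pausing for the next generated delay after each one."""
    for chunk, delay in zip(chunks, delays):
        upload(chunk)
        sleep(delay)


# Record uploads and waits in lists instead of sleeping, so the demo is instant.
sent, waits = [], []
upload_with_jitter(
    [b"a", b"b", b"c"],
    sent.append,
    jittered_delays(2.0, rng=random.Random(1)),
    sleep=waits.append,
)
print(sent, waits)
```

Passing `sleep` as a parameter keeps the pacing logic testable without real waiting; in actual use the default `time.sleep` applies.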
Secondly, it's fairly straightforward to ban an API key. But, again, with the extensibility
we can request a new one. They're not going to limit the API or the available tools we
can create just because of one or two bad eggs. Thirdly, the one thing that is fairly
evident is the null caps, those zero-size fragments that are right at the end of the
files. They make the files take up no space in the internal metrics, and
that's a fairly obvious signature. So we can
replace that by using something very small, like a one-byte file. Again,
by moving to one byte, we don't have unlimited space anymore. But with two gigabytes of storage,
we can still store two billion files like this. One of the reasons this is a major concern
to these companies is the fact that having unlimited space really undermines their business
model. You know, they have this whole drug dealer, the first bit's free kind of thing.
Getting unlimited storage really breaks their financial incentive for these kinds of things.
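The two-billion figure for one-byte caps is simple division. The short check below assumes a decimal two-gigabyte quota (2 × 10^9 bytes) and that only the one-byte cap of each file counts against it.

```python
QUOTA_BYTES = 2 * 10**9   # assumed decimal two-gigabyte quota
CAP_BYTES = 1             # visible size of each file when capped with one byte


def max_handles(quota_bytes: int, cap_bytes: int) -> int:
    """Number of capped files that fit if only the cap counts against the quota."""
    return quota_bytes // cap_bytes


print(max_handles(QUOTA_BYTES, CAP_BYTES))  # 2000000000
```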
Secondly, going the opposite way, if they break large binary writes, it will really
damage a lot of the existing tools that use Dropbox or any cloud storage system already.
For example, I use EncFS in Dropbox. That does a lot of binary modifications again and
again. That will probably trigger detection in much the same way as the DPAC tool.
Finally, I know that we've discussed this several times at various talks about PRISM
and everything, but deep file analysis is really time-consuming and frowned upon.
But really it's more time-consuming than it is problematic for them. So that's
something that we can use to get around that. So I got through everything I wanted to say
in about 11 minutes. So I just want to do some special thanks to some of my friends
who helped me get to this stage, who encouraged me to do this. And, yeah, that's all I have
to say. Enjoy your lunches.
Move, move, move, move! You're still speaking.
No, I'm not. Yes, you are.
Do not enjoy your lunches.
Oh, yeah, this is a fun conference. I forgot about that up here.
What do we call this? Shot the noob. Thank you.
Why are we doing this?
We shot the noob. First‑time speaker. What else do we need?
There, right there. Someone's first time at DEF CON. First time at DEF CON, sir? All right.
All right. Come on up. She was sitting next to him. So is this your
girlfriend?
Wife.
Wife. All right. Congratulations. All right. Here we go. It is very hard to be chosen
to speak at DEF CON. Very competitive.
So a big round of applause for our first‑time speaker.
Thank you. Thanks.
All right. Thanks a lot.
Okay. Now you can ‑‑
Now you can say you're done.
Okay. I'm done. Thank you.
